Joseph Kain bio photo

Joseph Kain

Professional Software Engineer learning Elixir.

Twitter LinkedIn Github

I’ve been working with Ecto a lot lately and wanted to spend this post writing up something I’ve learned while trying to put together a query. I’m by no means a database expert and this is one of the more complex queries I’ve had to write so far.

In the private project I’m working on, we have a database schema that consists of rentals that are marked unavailable on certain dates. These unavailable date ranges are stored in an Unavailability model and associated through a has_many / belongs_to relationship. That is, a Rental has many unavalable_date_ranges and Unavailability belongs to a Rental like this:

Rental and Unavailability

When trying to find a rental I want the user to be able to search (query) for rentals that are available on a given day or range of days. This means finding rentals that have no associated Unavailability records that overlap the desired days.

A while back I had written a query that turned out not to work. It only worked for very simple cases but failed when a rental had more than one associated Unavailability. I set out to fix this.

Research and Experimentation

I’m not an SQL or Ecto expert so I set out to research what I needed to learn in order to solve my problem. After some googling around I came across the blog post “All, None, and One: The SQL Left Join Trick”. This post taught me, as it promissed, how to use a left join to find rows with no matching associations. I highly recommend reading the post but I’ll summarize what I took away from it.

A left join between Rental and Unavailability will contain all of the rows from Rental even if they have no matching unavailable_date_ranges. And we can add additional criteria so that the results has only those rentals with no matching unavalable_date_ranges, that is those that are available.

I played around with this technique and was able to build up this query:

SELECT * FROM rentals
    LEFT OUTER JOIN unavailabilities
      ON (unavailabilities.rental_id = rentals.id)
  WHERE rental_id IS NULL;

This would return all the colums for rentals where there are no associated unavalable_date_ranges. The ON clause joins by the association. The WHERE clause filters the result to those rentals without any matching unavalable_date_ranges.

But this doesn’t get me exactly what I want. I need to add in a condition to check for overlap. I know how to write the condition:

(unavailabilities.start, unavailabilities.end) OVERLAPS
  ('2016-01-31'::date, '2016-02-05'::date)

but not where to put it. After some failed experiments and more googling I came to this stack overflow answer which helped me to work out the right SQL syntax. Here’s the SQL query I ended up with:

SELECT * FROM rentals
    LEFT OUTER JOIN unavailabilities
      ON (unavailabilities.rental_id = rentals.id)
      AND (unavailabilities.start, unavailabilities.end)
        OVERLAPS ('2016-01-31'::date, '2016-02-05'::date)
  WHERE rental_id IS NULL;

Working within the Postgres console I was able to confirm that this works for my use cases.

Build the Ecto Query from the SQL Query

The next step was to take the SQL query I have build and turn it into an Ecto query. Of course, I could try to use the raw SQL in a fragment but it would be nice to use a plain Ecto query if possible.

While doing this work I found the post “Diving Into Ecto Part 2” very helpful. As was the Ecto documentation for join.

Just as with the raw SQL the one thing that I had a little trouble with was adding the overlaps condition in the right place. Most join examples use assoc as in this example from the docs:

from p in Post,
  left_join: c in assoc(p, :comments),
  select: {p, c}

I wanted to combine the association with the overlap condition. But, this doesn’t work. Instead I had to write out the association condition by hand, which is similar to what we see in the SQL.

I ended up with this function to build the query:

@doc """
Query rentals available on specific date range.

* `query` - Initial query to start with.  Only, rentals included in this
   query will be considered.
* `s` - starting date to check for availablity
* `e` - ending date to check for availablity
"""
def available_between(query \\ Rental, s, e) do
  s = parse_date(s)
  e = parse_date(e)

  from rental in query,
    left_join: range in Unavailability,
      on: (range.rental_id == rental.id) and overlaps(range, ^s, ^e),
    where: is_nil(range.rental_id)
end

defmacro overlaps(range, s, e) do
  #  Uses the Postgres specific OVERLAPS function
  quote do
    fragment("(?, ?) OVERLAPS (?, ?)",
             unquote(range).start, unquote(range).end,
             type(unquote(s), Ecto.Date), type(unquote(e), Ecto.Date))
  end
end

The condition (range.rental_id == rental.id) plays the same role as assoc(rental, :unavailable_date_ranges) and then I combined that with overlaps(range, ^s, ^e) to setup the left join. I think this resembles the SQL quite closely.

Here’s the SQL generated by Ecto as shown in the logs:

SELECT r0."id", r0."title", r0."description", r0."location", r0."owner_id",
       r0."inserted_at", r0."updated_at" FROM "rentals" AS r0
    LEFT OUTER JOIN "unavailabilities" AS u1
      ON (u1."rental_id" = r0."id")
      AND (u1."start", u1."end") OVERLAPS ($1::date, $2::date)
  WHERE (u1."rental_id" IS NULL) [{2016, 1, 31}, {2016, 2, 5}]

Conclusion

In this post we explored a bit of SQL and I learned a bit about left joins and how to use the effectively. We also looked at building up Ecto queries from SQL. I learned a lot in doing this work and I hope you were able to learn from this post.