Over the past forty or so Tuesdays (has it really been that many?!), I’ve written on a spread of topics. There’s a slight problem with this: sometimes I forget what I have and haven’t written about. Here’s a case in point for you: last week, I wrote about finder objects, and casually tossed some scopes into my models. It turns out, I’ve never actually written about how scopes work, or what they really do!
I know, I know, that’s pretty terrible of me. I actually learned about scopes a while ago, and now I use them fairly often in my applications. However, I got so used to writing them that I never really thought that much about how they work behind the scenes. In fact, when I sat down to write this post, I had to go on a hunt into the Rails source code and Ruby blogosphere to figure out what was going on under the hood every single time I implemented a scope in my code.
The main reason that I like to use ActiveRecord scopes is that they allow us to specify commonly-used queries, but encapsulate these queries into methods, which we can then call on our models or association objects. My hunt led me to find out that scopes have been around for a while in Railsland, so they’re not exactly new. But what’s interesting about them is how their implementation has changed and grown with different releases of Rails. There’s also a lot of debate over how and when scopes are different from their simpler counterparts, class methods. But what makes a scope, exactly? Well, it’s finally time for us to hunt down the answer to that question.
The simplest of scopes
While developing applications, we often run into a situation where we want to select a particular group of objects (read: rows from our database) based on the characteristics that they share. The basic implementation of scopes can be summed up as this simple idea: being able to narrow down the objects that we want into a specific subset, based on some given parameters. We can tell ActiveRecord (an Object Relational Mapper, or ORM) to select a certain group of objects by implementing a scope that is specific to a model. Luckily, scopes aren’t too difficult to define, and mostly adhere to a simple structure.
In our bookstore app, for example, we have a Review object that we allow our users to write about the Books that they purchase through our store. For now, our Reviews belong to a User, and they have some basic attributes which map to columns in our database, including a published_at datetime attribute that we set when our User clicks the submit button, which saves their “drafted” review and turns it into a “published” review.
However, one side effect of having this attribute (and effectively, two different states or “types” of reviews) is that we now have no obvious way of selecting only our “published” reviews (that is to say, reviews that have a published_at date attribute set on them). How can we fix this? Well, we can write a class method that, when invoked, will run a query on our ActiveRecord model and only return the reviews that have this attribute. If we did that, our model might look something like this:
Okay, that’s a good start. Remember that the implicit self in the body of this class method is our Review class, so we’re basically running Review.where('published_at IS NOT NULL'). But now we run into another problem: this query isn’t all that specific, is it? What makes a published review, exactly? Well, it’s not just the fact that the published_at date should be set; we also need to account for the fact that some reviews could be set to be published in the future, at a later date. What we really want to select are our reviews that have a published_at date that has already happened; in other words, a date which occurred in the past. We can modify our class method to account for this:
If we try out this class method, we can see the exact SQL that gets executed:
However, instead of writing this functionality into the body of a class method, we could accomplish the exact same thing by using a scope:
which allows us to invoke a method in the console that pretty much looks like the method we had before:
Okay, wait — what’s going on here?! How did that even happen? Well, let’s break it down:
- First, we’re using something called the scope method. This class method is defined within the ActiveRecord::Scoping::Named module.
- Second, the scope class method requires two important arguments: a name for the scope, and a callable object that contains our query criteria. That last part about passing a callable object is pretty important: the scope body has to respond to call, which in practice means handing it a proc or a lambda. In fact, that -> {} syntax that we’re using is just another way of writing a lambda in Ruby.
- Third, and most interestingly, the return value of our scope was an ActiveRecord::Relation object. This is significant because ActiveRecord::Relation objects are not eagerly-loaded; they’re lazily-loaded. Lazy-loading basically means that we’re never going to query the database until we actually need to. What makes this really awesome is that lazy-loading allows us to call more methods (read: scopes galore!) on our returned ActiveRecord::Relation object.
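To make those three points a little more concrete, here is a rough sketch in plain Ruby. It is not the Rails implementation: ToyRelation, ToyModel, rows and query_count are all made-up names, and the "database" is just an array of hashes. It only aims to show the general shape of the idea, which is a scope macro that checks for a callable body, defines a class method, and hands back a relation that does no work until we iterate over it.

```ruby
# Toy stand-ins for illustration only; none of these classes are Rails code.
class ToyRelation
  include Enumerable

  def initialize(model, filters = [])
    @model = model
    @filters = filters
  end

  # Chaining only records the filter and returns a new relation.
  def where(&filter)
    ToyRelation.new(@model, @filters + [filter])
  end

  # The "query" runs only once records are actually needed.
  def each(&block)
    @model.query_count += 1
    @model.rows.select { |row| @filters.all? { |f| f.call(row) } }.each(&block)
  end
end

class ToyModel
  class << self
    attr_accessor :rows, :query_count

    def all
      ToyRelation.new(self)
    end

    def where(&filter)
      all.where(&filter)
    end

    # Hypothetical simplification of a scope macro: insist on a callable body,
    # then define a class method that evaluates it against the model class.
    def scope(name, body)
      raise ArgumentError, "The scope body needs to be callable." unless body.respond_to?(:call)

      define_singleton_method(name) { |*args| instance_exec(*args, &body) }
    end
  end
end

class Review < ToyModel
  self.rows = [
    { title: "Loved it",      published_at: Time.now - 86_400 },
    { title: "Still a draft", published_at: nil },
    { title: "Scheduled",     published_at: Time.now + 86_400 }
  ]
  self.query_count = 0

  scope :published, -> { where { |row| row[:published_at] && row[:published_at] <= Time.now } }
end

relation = Review.published                  # builds a relation, loads nothing
queries_before = Review.query_count
titles = relation.map { |row| row[:title] }  # iterating triggers the "query"
queries_after = Review.query_count
```

In this toy version, queries_before stays at zero and the counter only ticks up once we call map, which is the lazy-loading behavior described above in miniature.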
It looks like perhaps there’s some funky stuff going on here. But, all of these things still don’t really answer our burning question: why use a scope when we could just write a class method?!
Class methods by any other name
What’s in a scope? A class method by any other name would smell just as sweet! Oops, I got carried away there. Enough poetry, let’s talk prose. Or scopes, rather, and why we might want to use them.
We want to change the implementation of our published class method such that it accepts an argument that makes our query more flexible. Let’s say that we want to be able to filter our Reviews by a specific publication date. We might now have a class method that looks like this:
The on parameter would ideally be a Date or a DateTime object that would dynamically change the rows that we’ll query for in our database. This will behave exactly like we want it to, until…it breaks. How can we break this? Well, let’s say that we now want to order our published reviews by their position attribute, which for the time being, is just an integer. No problem, we can do that, right?
Sure, no problem! This returns exactly what we’d expect. But what if we’re relying on this method elsewhere and somehow don’t pass in a real date to our published method? What happens then?
BOOM! Everything broke. Oops. What happened here? We tried to call the order method on a falsy object (aka nil). Obviously Ruby is unhappy, because it looks like Review.published(nil) returns nil, which doesn’t respond to a method called order!
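Here is a small plain-Ruby reproduction of that failure. It is only a sketch: FakeRelation and the stubbed where are hypothetical stand-ins so that it runs without Rails, and the trailing if inside published is an assumption about how a class method could end up returning nil in the first place.

```ruby
# Hypothetical stand-ins so the failure can be reproduced without Rails.
class FakeRelation
  def order(_column)
    self # a real relation would return a new relation carrying the ORDER BY
  end
end

class Review
  def self.where(*_conditions)
    FakeRelation.new
  end

  # Assumed shape of the class method: the query is only built when a date is
  # passed in, so the trailing `if` makes the method return nil otherwise.
  def self.published(on)
    where("published_at = ?", on) if on
  end
end

with_date = Review.published(Time.now).order(:position) # fine: `order` is called on a relation

failure = nil
begin
  Review.published(nil).order(:position)
rescue NoMethodError => error
  failure = error # calling `order` on nil is what blows up
end
```

Nothing about order itself is broken here; the trouble is entirely that the class method gave us nil instead of something chainable.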
Now, let’s fast forward to our new scope implementation in the Review class:
We’ve changed our callable object to accept a parameter, which is how we’re going to determine our published_at date. We can be pretty certain that this will execute the same query if we pass an actual date to this scope. But what if we pass nil again?
Well, would you look at that! It didn’t break! It ran our expected query, but because scopes return ActiveRecord::Relation objects, it didn’t call order on nil, it just kept chaining on to our query. The first part of our query (responsible for finding any reviews that were published on a date) didn’t return anything, but the second part of our query (responsible for just ordering whatever got returned by our first query) did work! How, exactly? Well, it just so happens that chaining a query method onto an ActiveRecord::Relation, even one that matches no rows, simply returns another relation. An important thing to note: if we had a query that was scoping down our reviews to ones that were published on a date and ordering those objects by their position, we would have gotten an empty relation:
The above query narrows down our scope quite a bit, which we could do if we wanted to specify that to SQL. But in our case, our ORDER BY clause isn’t grouped inside of the AND, but instead exists outside of it, which is why we’re not getting an empty relation returned to us.
While we’re on the topic of relations, it’s also important to note that the method we have right now does not return a record to us! A relation is not a record; it’s an object that represents a query. We’d need to explicitly ask for a record if we wanted one returned:
Hopefully we should now be able to easily see that the order method that we’re chaining on right there at the end could really be abstracted into its own scope! Let’s fix that, shall we?
Much better. Now we can just chain on our order scope to our published scope without ever having to worry that our scopes will break. But wait, there’s even more we can do with scopes!
Special scope tricks
Because scopes accept lambdas and procs, we can pass in different arguments. We did that before when we passed in a datetime parameter. But this kind of flexibility can be especially powerful, because we can do things like pass in limits:
This will run our same SQL query, but will add LIMIT 10 to the end of it. We can customize this scope further, or we can add more if we need to. We also might want to just perpetually apply a scope to all queries on a specific model. When we run into this situation, we can use the default_scope method.
This will automatically append ORDER BY "review"."position" DESC to all of our SQL queries on this model. What’s really nice about having a default scope is that we don’t need to write and perpetually call a method named something like by_published_date on this model; it will be applied by default to every query on this class.
According to the documentation, if we want to get super fancy with our default scope and have so much logic that it’s bursting from our callable object’s so-called seams, we can also define it in an alternate way as a class method:
We’re also not limited to just using the where method! We can use plenty of other ActiveRecord::Relation methods, such as joins, or includes, which will eager load other associations when we want to. Here’s a handy scope we could add to our Shipment model that we built out last week:
This is pretty cool because we’re using our default_scope method to automatically eager-load our associated order and line_items on our shipment without having to make two additional queries just to load them! As is the case with includes, it might not always be a good idea to do this, since we could be loading more records than we want, or could get stuck with an N+1 situation on our hands. But if we know what we’re doing and are sure that this scope is necessary, it can be pretty powerful.
We can also merge two scopes together, which effectively allows us to mix and match different WHERE conditions and group them together in SQL with an AND:
which we can then merge into a single SQL query by chaining our scopes together:
We’ll notice that in this situation, our WHERE clauses are grouped together with an AND, which can help us when it comes to writing super specific queries.
tl;dr?
- ActiveRecord scopes give us a lot of flexibility, even though they are effectively defining a class method on a model. The fundamental difference between them, however, is that scopes should always return an ActiveRecord::Relation object, which makes them forever chainable!
- How does the scope method actually work? I’m not sure that I understand all of it, but perhaps you will! Check it out in the Rails source code!
- There are a few great primers on writing effective scopes, like this one, and this other one.