Support the ongoing development of Laravel.io →

Laravel - Eager loading can be bad!

29 Feb, 2024 7 min read

Photo by Getty Images on Unsplash

Hello 👋

Yes, you read it right. Eager loading can be bad, really bad. However, we often resort to it when dealing with an N+1 scenario, thinking that we've resolved the issue, when in fact, we might have made it worse. How? Let's see.

How bad it gets

For this demo, we are building Laravel Forge. Like (almost) every Laravel application, we will have a One To Many relationship.

We aim to log every activity for a server. A log can include the activity type, the user who initiated it, and other useful information for later analysis.

<?php

namespace App\Models;
 
use Illuminate\Database\Eloquent\Model;
use Illuminate\Database\Eloquent\Relations\HasMany;
 
class Server extends Model
{
    // ...

    public function logs(): HasMany
    {
        return $this->hasMany(Log::class);
    }
}

Now, in the application, we want to list all the servers. So, we might do something like

<!-- It's a fancy table, use your imagination... -->

<table>
    <tr>
        <th>Name</th>
    </tr>
    @foreach ($servers as $server)
    <tr>
        <td>{{ $server->name }}</td>
    </tr>
    @endforeach
</table>

Moving forward, we have 10 servers, and each of them has 1000 logs.

So far, so good. Now, we want to display when the last activity on a server occurred

<table>
    <tr>
        <th>Name</th>
        <th>Last Activity</th>
    </tr>
    @foreach ($servers as $server)
    <tr>
        <td>{{ $server->name }}</td>
        <td>
            {{ $server->logs()->latest()->first()->created_at->diffForHumans() }}
        </td>
    </tr>
    @endforeach
</table>

Basic things, we access the logs() relation, ordering it to retrieve the latest record, getting the created_at column, and formatting it for better readability using diffForHumans(). The latter yields something like "1 week ago".

But this is bad, we've introduced an N+1 problem.

If you don't know what a N+1 is, we are running the following queries

-- 1 query to get all the servers
select * from `servers`

-- N queries for each of servers
select * from `logs` where `logs`.`server_id` = 1 and `logs`.`server_id` is not null order by `created_at` desc limit 1
select * from `logs` where `logs`.`server_id` = 2 and `logs`.`server_id` is not null order by `created_at` desc limit 1
-- ...
select * from `logs` where `logs`.`server_id` = 10 and `logs`.`server_id` is not null order by `created_at` desc limit 1

To resolve this issue, we typically reach out to Eager Loading (I know you did).

// In your controller
$servers = Server::query()
    ->with('logs')
    ->get();

// In your blade
<table>
    <tr>
        <th>Name</th>
        <th>Last Activity</th>
    </tr>
    @foreach ($servers as $server)
    <tr>
        <td>{{ $server->name }}</td>
        <td>
            {{ $server->logs->sortByDesc('created_at')->first()->created_at->diffForHumans() }}
        </td>
    </tr>
    @endforeach
</table>

With this update, we manage to reduce it to only 2 queries

-- 1 query to get all the servers
select * from `servers`

-- 1 query to get all the related logs
select * from `logs` where `logs`.`server_id` in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

And it looks like we addressed the problem, right?

Wrong! We're only considering the number of queries. Let's examine the memory usage and the count of loaded models; these factors are equally important.

  • Before eager loading
    • 11 queries: 1 to retrieve all servers and 10 queries for each server.
    • A total of 20 models loaded.
    • Memory usage: 2MB.
    • Execution time: 38.19 ms.

  • After eager loading
    • 2 queries: 1 to get all servers and 1 to get all logs.
    • A total of 10010 models loaded 🤯.
    • Memory usage: 13MB (6.5x increase).
    • Execution time: 66.5 ms (1.7x increase).
    • Slower computational time due to loading all the models 🐢.

The tool in the screenshot is Debugbar.

It looks like we didn't fix anything; in fact, we made it worse.. And keep in mind, this is a very simplified example. In a real-world scenario, you can easily end up with hundreds or thousands of records, leading to the loading of millions of models.. The title make sense now?

How do we truly solve this?

In our case, eager loading is a NO-NO. Instead, we can use sub-queries and leverage the database to perform tasks it is built and optimized for.

$servers = Server::query()
    ->addSelect([
        'last_activity' => Log::select('created_at')
            ->whereColumn('server_id', 'servers.id')
            ->latest()
            ->take(1)
    ])
    ->get();

This will result in a single query

select `servers`.*, (
        select `created_at`
        from `logs`
        where
            `server_id` = `servers`.`id`
        order by `created_at` desc
        limit 1
    ) as `last_activity`
from `servers`

Since the column we need from the relationship is now computed in a subquery, we have the best of both worlds: only 10 models loaded and minimal memory usage.

You might be thinking that with this approach comes a drawback: the last_activity column is now a regular string. So, if you want to use the diffForHumans() method, you'll encounter the Call to a member function diffForHumans() on string error. But no worries, you haven't lost the casting; it's as simple as adding a single line.

$servers = Server::query()
    ->addSelect([
        'last_activity' => Log::select('created_at')
            ->whereColumn('server_id', 'servers.id')
            ->latest()
            ->take(1)
    ])
    ->withCasts(['last_activity' => 'datetime']) // casts here
    ->get();

By chaining the withCasts() method, you can now treat the last_activity as if it were a date.

How about the Laravel way?

The reddit community never disappoints! They have pointed out another alternative solution, a Laravel-ish approach; One Of Many.

Let's define a new relationship to always retrieve the latest log

// In the Server Model
public function latestLog(): HasOne
{
    return $this->hasOne(Log::class)->latestOfMany();
}

Now we can use the relationship like this

// In the Controller (or action..)
$servers = Server::query()
    ->with('latestLog')
    ->get();

This will result in the following queries

select * from `servers`

select `logs`.*
from
    `logs`
    inner join (
        select MAX(`logs`.`id`) as `id_aggregate`, `logs`.`server_id`
        from `logs`
        where
            `logs`.`server_id` in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
        group by
            `logs`.`server_id`
    ) as `latestOfMany` 
    on `latestOfMany`.`id_aggregate` = `logs`.`id`
    and `latestOfMany`.`server_id` = `logs`.`server_id`

And it can be used in the Blade like this

// In the Blade view
@foreach ($servers as $server)
    {{$server->latestLog }}
@endforeach

For a comparison between the two methods:

  • Using subqueries
    • 1 query.
    • A total of 10 models loaded.
    • Memory usage: 2MB.
    • Execution time: 21.55 ms.

  • Using the latestOfMany()
    • 2 queries
    • A total of 20 models loaded.
    • Memory usage: 2MB.
    • Execution time: 20.63 ms

Both methods are really good; which one to use will depend on your case. If you absolutely need the child model hydrated and will make use of all its fields, go with the latestOfMany(). However, if you only need a few fields, then the subquery will perform better. This is because, in the subquery, you select exactly what you need. Regardless of the number of records you have, the memory usage will be almost the same. Now, for the second method, memory usage is heavily dependent on the number of columns your table has. In reality, a table can easily have 50 columns, so hydrating the model will be expensive, even if it is only one per parent, that is to keep in mind when choosing!

Conclusion

I have seen some developers, by design, choose to force eager loading for all the models. You can't just use it for everything, as much as it seems like you've solved the issue, you might have actually created a worse one. Not everything is a nail; the hammer might not work 🔨

Last updated 3 weeks ago.

driesvints, sajibadhi, speed, antoniputra liked this article

4
Like this article? Let the author know and give them a clap!
oussamamater (Oussama Mater) I'm a software engineer and CTF player. I use Laravel and Vue.js to turn ideas into applications 🚀

Other articles you might like

November 18th 2024

Laravel Custom Query Builders Over Scopes

Hello 👋 Alright, let's talk about Query Scopes. They're awesome, they make queries much easier to r...

Read article
November 19th 2024

Access Laravel before and after running Pest tests

How to access the Laravel ecosystem by simulating the beforeAll and afterAll methods in a Pest test....

Read article
November 11th 2024

🍣 Sushi — Your Eloquent model driver for other data sources

In Laravel projects, we usually store data in databases, create tables, and run migrations. But not...

Read article

We'd like to thank these amazing companies for supporting us

Your logo here?

Laravel.io

The Laravel portal for problem solving, knowledge sharing and community building.

© 2024 Laravel.io - All rights reserved.