Table of Contents
Sometimes, there might be a situation where you need to insert a lot of test data into the database. Usually for testing purposes and performance in Laravel.
However, seeding a huge amount of records can take an exceptionally long time and is not ideal when you’re in a state where you have to frequently seed databases.
So, let’s try to improve the seeding time so that we won’t have a hard time waiting for it to finish.
Generating Random Data
Below, is a simple PostFactory.php file for generating fake data using Faker.
<?php
/** @var \Illuminate\Database\Eloquent\Factory $factory */
use App\Post;
use Faker\Generator as Faker;
$factory->define(Post::class, function (Faker $faker) {
return [
'title' => $faker->sentence,
'body' => $faker->paragraph(30),
'user_id' => rand(1, 10),
'category_id' => rand(1, 10),
'is_published' => rand(0, 1),
];
});
Now, let’s create 10K posts using the factory() helper function in PostSeeder to create records in your database.
So, this would be a typical scenario of how most people usually generate a huge number of test data for their project.
<?php
use Illuminate\Database\Seeder;
use App\Post;
class PostSeeder extends Seeder
{
/**
* Run the database seeds.
*
* @return void
*/
public function run()
{
// Remove all the records
Post::truncate();
factory(Post::class, 10000)->create();
}
}
Then, we execute the db:seed
command.
php artisan db:seed
Seeding: PostSeeder
Seeded: PostSeeder (8.1 seconds)
Database seeding completed successfully.
As you can see, this approach of seeding large amounts of data can take time. But, there’s another way of doing the same thing more quickly and conveniently.
Optimizing Database Seeding using chunk() and insert()
Notice! Starting from Laravel 7, the default format for dates is datetime with timezone.
For this to work, first you need to add the property below in your model because when converting the eloquent model into an array using toArray(). By default casts for created_at and updated_at are set to DateTime with timezone.
namespace App;
use Illuminate\Database\Eloquent\Model;
class Post extends Model
{
protected $cast = [
'created_at' => 'datetime',
'updated_at' => 'datetime',
];
//..
First, we replace create() with make() which will return the Laravel Collection of the model. Then, store it in a variable instead of directly creating it on the fly.
Second, we divide this large collection of our data stored in the variable into small chunks using the chunk() function provided by Laravel Collection.
Finally, instead of inserting just one record at a time. We can directly insert each of these chunks after transforming them into an array as a batch insert into our database.
<?php
use Illuminate\Database\Seeder;
use App\Post;
class PostSeeder extends Seeder
{
/**
* Run the database seeds.
*
* @return void
*/
public function run()
{
// Remove all the records
Post::truncate();
$posts = factory(Post::class, 10000)->make();
$chunks = $posts->chunk(2000);
$chunks->each(function ($chunk) {
Post::insert($chunk->toArray());
});
}
}
php artisan db:seed
Seeding: PostSeeder
Seeded: PostSeeder (3.91 seconds)
Database seeding completed successfully.
Finally, you can see that we have successfully reduced the time for seeding literally in half the time before.
Really useful
Thank you so much
Thank you, using chunks helped me to get my db:seed from 16 minutes to 14 seconds (see https://laracasts.com/discuss/channels/laravel/why-is-dbseed-so-slow-with-laravel?page=1&replyId=746610 for full story). However, I am surprised that inserting 10k rows sequentially only takes 8.2 seconds for you. Whats your default setup? If I am using Laravel 8 with Sail, then inserting 10k rows takes 147 seconds for me with intel7 and 36 GB ram.