Its a really troublesome task to update mass data in production. One of the most used method to do this task is:
collect($csvContacts)->each(function (array $row) {
$contact = Contact::updateOrCreate(
['email' => $row['email']],
['name' => $row['name'], 'address' => $row['address']]
);
});
updateorCreate()
method basically has two query within it. At first the query, checks if there are any rows that matches the key/value in first array. If the row exists, it will update it otherwise a new row is inserted in the table.
This method is suitable for database having small scale data. Imagine if we had more than 10000 rows which means 20000 queries will have to be executed. This will take a lot of time and make system slow which is not efficient at all. This is where upsert()
comes in action which has higher efficiency than updateorCreate()
.
upsert()
method allows us to perform multiple “upserts” in a single query. It takes value to insert or update in first argument and second argument lists the column(s) that uniquely identify records within the associated table. Third argument consist consists of column that should be updated if matching record already exists.
How to use Upsert?
Single query containing all the rows are made while using upsert which includes on duplicate key update
in MySQL and on conflict ... do update set
in Postgres. This command instructs database to update record if it already exists.
Another important thing to notice while using upsert method is to have a primary or unique key index. If no primary or unique index is found in the database then we will get error.
For higher efficiency, you can also chunk queries into block. We might hit query size limit if it is not done. Here is an example of proper upsert method usage:
collect($csvContacts)
->map(function (array $row) {
return Arr::only($row, ['email', 'name', 'address']);
})
->chunk(1000)
->each(function (Collection $chunk) {
Contact::upsert($chunk, 'email');
});