mysql

Upsert with conditional data update? oh my(sql)

Posted on

Found an interesting scenario from an app using mysql where:

  1. need to perform upsert (insert or update when same data/keys already exists)
  2. need to do the upsert in bulk
  3. when performing the update, also need to only update the data based on certain condition (e.g. based on timestamp)

After some googling and SO-ing, managed to create the sample sql snippet to get the above done.

The sample upsert_table definition:

create table upsert_table (
id int(11) unsigned not null primary key auto_increment,
unkey1 varchar(10) not null,
unkey2 varchar(10) not null,
val1 varchar(100) not null,
val2 varchar(100) not null,
lastmodified int(11) unsigned,
unique(unkey1, unkey2)
)

Note: unlike in some databases where the upsert can be done based on a certain condition, in mysql, the upsert is determined by the unique index or primary key, refer to the excerpt from the official documentation:

If you specify an ON DUPLICATE KEY UPDATE clause and a row to be inserted would cause a duplicate value in a UNIQUE index or PRIMARY KEY, an UPDATE of the old row occurs.

Let’s populate the table with the following data:

+----+--------+--------+-----------+----------+--------------+
| id | unkey1 | unkey2 | val1      | val2     | lastmodified |
+----+--------+--------+-----------+----------+--------------+
|  1 | A      | 11     | Meeting A | room 1-1 |   1503662401 |
|  2 | B      | 22     | Meeting B | room 2-2 |   1503662402 |
|  3 | C      | 33     | Meeting C | room 3-3 |   1503662403 |
|  4 | D      | 44     | Meeting D | room 4-4 |   1503662404 |
|  5 | E      | 55     | Meeting E | room 5-5 |   1503662405 |
+----+--------+--------+-----------+----------+--------------+

Execute the bulk upsert as well as conditional data update:

insert into upsert_table
	(unkey1, unkey2, val1, val2, lastmodified)
values
	('A', '11', 'Updated Meeting A', 'room 1-10', UNIX_TIMESTAMP('2017-08-25 20:00:11')),
	('B', '22', 'Should not update Meeting B', 'room 2-20', UNIX_TIMESTAMP('2017-08-25 19:00:02')),
	('F', '66', 'New Meeting F', 'room 6-6', UNIX_TIMESTAMP('2017-08-25 20:00:06'))
on duplicate key update
	val1 = if (lastmodified < values(lastmodified), values(val1), val1),
	val2 = if (lastmodified < values(lastmodified), values(val2), val2),
	lastmodified = if (lastmodified < values(lastmodified), values(lastmodified), lastmodified);

Here’s the final result:

+----+--------+--------+-------------------+-----------+--------------+
| id | unkey1 | unkey2 | val1              | val2      | lastmodified |
+----+--------+--------+-------------------+-----------+--------------+
|  1 | A      | 11     | Updated Meeting A | room 1-10 |   1503662411 |
|  2 | B      | 22     | Meeting B         | room 2-2  |   1503662402 |
|  3 | C      | 33     | Meeting C         | room 3-3  |   1503662403 |
|  4 | D      | 44     | Meeting D         | room 4-4  |   1503662404 |
|  5 | E      | 55     | Meeting E         | room 5-5  |   1503662405 |
|  6 | F      | 66     | New Meeting F     | room 6-6  |   1503662406 |
+----+--------+--------+-------------------+-----------+--------------+

What just happened?

  1. row with key “A”-“11” was updated because data with same unique key existed and new timestamp is greater than existing timestamp
  2. row with key “B”-“22” was NOT updated although data with same unique key existed because the new timestamp is lesser than existing timestamp (actually it’s just updated with the existing value if you look the query closely)
  3. row with key “F”-“66” should be inserted because data with same unique key does not exist

Not forgetting to thank to the references below
https://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html
https://stackoverflow.com/questions/32777081/bulk-insert-and-update-in-mysql
https://thewebfellas.com/blog/conditional-duplicate-key-updates-with-mysql

Get the full gist from -> https://gist.github.com/bembengarifin/9ea92f16eeb308ad5675fdc0995f4d1b