This post could have been titled:
"Why SELECT ... FOR UPDATE
doesn’t work on non-existent rows like you might think it does".
FastMail has been a long term user of the MySQL database server, and in particular the InnoDB storage engine. InnoDB provides many core database features (ACID, transactions, row-level locking, etc).
One thing that comes up from time to time is the desire to do the following sequence atomically:
The important point is you want to do this atomically. That is: if one process checks for the row, but another process is already doing the "expensive calculation", it blocks until that process completes and has inserted the row into the table.
So if you have a table:
CREATE TABLE foo (
Id INT NOT NULL PRIMARY KEY,
Data BLOB
);
The primary key ensures Id
is unique.
One thought would be that you could do a SELECT
and then INSERT IGNORE
if the row doesn’t exist. This doesn’t return an error if the INSERT
would result in a duplicate key, but does mean that the "expensive calculation" might be done multiple times.
After looking at some MySQL documentation, you might think a transaction with a SELECT ... FOR UPDATE
statement is what you want. This would check for the data row, but lock the row gap if it doesn’t exist. Excellent!
Unfortunately this doesn’t work.
Quoting directly from an answer by Heikki Tuuri (the original developer of InnoDB)
http://forums.mysql.com/read.php?22,10506,11347#msg-11347
READ COMMITTED and REPEATABLE READ only affect plain SELECTs, not
SELECT … FOR UPDATE.In the 4.1.8 execution:
1.mysql> select * from T where C = 42 for update; 2.mysql> select * from T where C = 42 for update; -- The above select doesn't block now. 1.mysql> insert into T set C = 42; -- The above insert blocks. 2.mysql> insert into T set C = 42; ERROR 1213: Deadlock found when trying to get lock; Try restarting transaction
the explanation is that the X lock on the ‘gap’ set in SELECT … FOR
UPDATE is purely ‘inhibitive’. It blocks inserts by other users to the
gap. But it does not give the holder of the lock a permission to insert.Why the inhibitiveness: if we have three index records:
aab <first gap> aba <second gap> acc
there are two gaps there. Suppose user 1 locks the first gap, and user 2
locks the second gap.But if ‘aba’ is delete-marked, purge can remove it, and these two gaps
merge. Then BOTH user 1 and user 2 have an exclusive lock on the same
gap. This explains why a lock on gap does not give a user a permission
to insert. There must not be locks by OTHER users on the gap, only then
is the insert allowed.Best regards,
Heikki
Oracle Corp./Innobase Oy
Although this post is over 12 years old, it appears to be still relevant to InnoDB today, as any quick testing will show.
In fact in some common scenarios, it’s even worse than you might expect. Say you have one table that has an auto increment primary key. You then use that created id to calculate and insert "expensive to calculate" data into another table with the same id. The first FOR UPDATE
locks the gap for any id’s after it as well.
Say table T
has no C
‘s > 41.
t1> select * from T where C = 42 for update;
t2> select * from T where C = 43 for update;
t1> mysql> insert into T set C = 42;
-- The above insert blocks.
t2> insert into T set C = 43;
ERROR 1213: Deadlock found when trying to get lock; Try restarting
transaction
There’s some more commentary online about this issue:
The options appear to be:
SELECT ... FOR UPDATE
on that tableGET_LOCK
/RELEASE_LOCK
functions. If you include the primary key value(s) in the lock name, you can avoid having a global lock on the whole table, the lock names act effectively as row locks. Beware that MySQL earlier than 5.7.5 limits you to one lock per sessionSELECT
+ INSERT IGNORE
and hope that most of the time it’s not a problem and that the wasted computation is rare.Sometimes this is rather frustrating, but there don’t appear to be any other good options.
For the curious, when we originally started developing FastMail in 2000 we started with PostgreSQL. However at the time PostgreSQL had a limit of 8kb of data per-row. Because (again, at the time) we stored emails in the database, this was not viable and we switched to MySQL InnoDB. As these things are, it was only a short while later that we switched to IMAP and Cyrus for email storage.
Sometimes the state of a product is the sum of a series of seemingly inconsequential decisions at the time.
Upgrade your privacy and productivity and join the best in email.
Want more information? Visit our side-by-side comparison chart to learn more about why Fastmail is
a great alternative to Gmail.
Fastmail customer, Xavier, gets the most out of his email account by using Fastmail Identities, Aliases, and Masked Email!