This post could have been titled: "SELECT ... FOR UPDATE doesn’t work on non-existent rows like you might think it does".
FastMail has been a long-term user of the MySQL database server, and in particular the InnoDB storage engine. InnoDB provides many core database features (ACID, transactions, row-level locking, etc).
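One thing that comes up from time to time is the desire to do the following sequence atomically: check whether a row exists in a table; if it does, use the data in that row; if it doesn’t, do an "expensive calculation" and insert the result as a new row.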
The important point is you want to do this atomically. That is: if one process checks for the row, but another process is already doing the "expensive calculation", it blocks until that process completes and has inserted the row into the table.
So if you have a table:
CREATE TABLE foo ( Id INT NOT NULL PRIMARY KEY, Data BLOB );
The primary key ensures Id is unique.
One thought would be that you could do a SELECT and then INSERT IGNORE if the row doesn’t exist. This doesn’t return an error if the INSERT would result in a duplicate key, but does mean that the "expensive calculation" might be done multiple times.
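As a rough sketch of that approach against the foo table above (the Id value 42 and the 'computed-data' placeholder are hypothetical, standing in for a real key and the result of the expensive calculation):

```sql
-- Step 1: check whether the row already exists.
SELECT Data FROM foo WHERE Id = 42;

-- Step 2: if no row came back, the application performs the expensive
-- calculation here. Nothing stops a second session from doing the same
-- work at the same time.

-- Step 3: store the result. If another session inserted Id = 42 first,
-- INSERT IGNORE downgrades the duplicate-key error to a warning and
-- leaves the existing row untouched.
INSERT IGNORE INTO foo (Id, Data) VALUES (42, 'computed-data');

-- Step 4 (optional): re-read, so every session uses the row that won.
SELECT Data FROM foo WHERE Id = 42;
```

The table stays consistent, but two sessions that both find no row in step 1 will both pay for the calculation in step 2.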
After looking at some MySQL documentation, you might think a transaction with a SELECT ... FOR UPDATE statement is what you want. This would check for the data row, but lock the row gap if it doesn’t exist. Excellent!
Unfortunately this doesn’t work.
Quoting directly from an answer by Heikki Tuuri (the original developer of InnoDB):
READ COMMITTED and REPEATABLE READ only affect plain SELECTs, not
SELECT … FOR UPDATE.
In the 4.1.8 execution:
1.mysql> select * from T where C = 42 for update;
2.mysql> select * from T where C = 42 for update;
-- The above select doesn't block now.
1.mysql> insert into T set C = 42;
-- The above insert blocks.
2.mysql> insert into T set C = 42;
ERROR 1213: Deadlock found when trying to get lock; Try restarting transaction
the explanation is that the X lock on the ‘gap’ set in SELECT … FOR
UPDATE is purely ‘inhibitive’. It blocks inserts by other users to the
gap. But it does not give the holder of the lock a permission to insert.
Why the inhibitiveness: if we have three index records:
aab <first gap> aba <second gap> acc
there are two gaps there. Suppose user 1 locks the first gap, and user 2
locks the second gap.
But if ‘aba’ is delete-marked, purge can remove it, and these two gaps
merge. Then BOTH user 1 and user 2 have an exclusive lock on the same
gap. This explains why a lock on gap does not give a user a permission
to insert. There must not be locks by OTHER users on the gap, only then
is the insert allowed.
Oracle Corp./Innobase Oy
Although this post is over 12 years old, it appears to be still relevant to InnoDB today, as any quick testing will show.
In fact, in some common scenarios, it’s even worse than you might expect. Say you have one table that has an auto-increment primary key. You then use that created id to calculate and insert "expensive to calculate" data into another table with the same id. The first FOR UPDATE locks the gap for any ids after it as well.
Suppose T has no rows with C > 41:
t1> select * from T where C = 42 for update;
t2> select * from T where C = 43 for update;
t1> insert into T set C = 42;
-- The above insert blocks.
t2> insert into T set C = 43;
ERROR 1213: Deadlock found when trying to get lock; Try restarting transaction
There’s some more commentary online about this issue:
The options appear to be:
SELECT ... FOR UPDATE on that table
Use the GET_LOCK and RELEASE_LOCK functions. If you include the primary key value(s) in the lock name, you avoid a global lock on the whole table, and the lock names effectively act as row locks. Beware that MySQL earlier than 5.7.5 limits you to one lock per session.
Use INSERT IGNORE and hope that most of the time it’s not a problem and that the wasted computation is rare.
Sometimes this is rather frustrating, but there don’t appear to be any other good options.
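For illustration, the named-lock option might look roughly like the following for the foo table from earlier. The lock name 'foo.42', the 10-second timeout, and the 'computed-data' placeholder are all assumptions chosen for this sketch, not anything MySQL prescribes.

```sql
-- Hypothetical lock name: table name plus primary key value, so only
-- sessions working on Id = 42 contend with each other.
-- GET_LOCK returns 1 on success, 0 if the timeout (10 seconds here)
-- expires, and NULL on error.
SELECT GET_LOCK('foo.42', 10);

-- Holding the lock, check for the row: another session may have
-- inserted it while this one was waiting.
SELECT Data FROM foo WHERE Id = 42;

-- Only if the row is still missing: do the expensive calculation in
-- the application, then store the result.
INSERT INTO foo (Id, Data) VALUES (42, 'computed-data');

-- Release the lock so waiting sessions can proceed and find the row.
SELECT RELEASE_LOCK('foo.42');
```

These locks belong to the session rather than the transaction, so if the INSERT runs inside an explicit transaction, commit it before calling RELEASE_LOCK. Otherwise a waiting session could acquire the lock before the new row is visible to it.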
For the curious, when we originally started developing FastMail in 2000 we started with PostgreSQL. However, at the time PostgreSQL had a limit of 8 KB of data per row. Because (again, at the time) we stored emails in the database, this was not viable and we switched to MySQL InnoDB. As these things go, it was only a short while later that we switched to IMAP and Cyrus for email storage.
Sometimes the state of a product is the sum of a series of seemingly inconsequential decisions at the time.