CREATE TABLE UserStack (
id int unsigned not null default 0,
activity medium int unsigned not null default 0,
state ENUM('ready','processing','processed', 'error'),
INDEX (state, activity)
) ENGINE = INNODB;
Now you have an application that is spread across many servers which pops a set of users off the stack and sets them to a processing state.
SELECT * FROM UserStack WHERE state='ready' ORDER BY activity DESC LIMIT 10 FOR UPDATE
Mark them Processing
If multiple threads do this at the same time; the 1st thread will get the 10 ids while 99% of the others will fail with "Deadlock Detected try restarting transaction"
The reason: from innodb's perspective many different clients are asking to perform actions on the same data all at the same time-thus a deadlock is detected. To exacerbate the problem the data is ordered differently from the cluster index-so in essence the entire table is scanned for this example. The table has 30 million rows.
I've tested transaction isolation levels:
How to get around this:
Lets add a column to make the rows unique for the calling process; lets add pid (add server int unsigned if you want to run the process from many servers).
The access pattern for the table is now going to involve pid and state. The column pid nor the combo with state is unique, so the user id which defines the user.
ALTER TABLE UserStack DROP PRIMARY KEY, ADD pid int unsigned NOT NULL DEFAULT 0, ADD PRIMARY KEY(state,pid,id), DROP INDEX state;
Then change SELECT ... FOR UPDATE to an UPDATE statement. UPDATE operations for some reason are better at concurrency then SELECT FOR UPDATE. The update scans the PRIMARY index and updates the selected amount of rows defined by the LIMIT.
UPDATE UserStack SET pid=getmypid(), state='processing' WHERE pid = 0 AND state='ready' LIMIT 10;
SELECT * FROM UserStack WHERE pid = getmypid() AND state='processing';
mark as completed
Rinse and repeat.
The desired affect is complete. Each thread can grab its own work and each thread is guaranteed a unique user or block of users to process. The act of marking rows (marking their territory) and grabbing the marked rows takes a fraction of seconds as it should.
A good write ups on deadlocks