TL;DR
Database triggers usually are not concurrency-safe. The programmers should be aware of this, and be responsible to apply necessary locks to make it safe.
Constraints within a row are easy to validate. Both application-level validation and database constraints can do the job perfectly. SQL's ADD CONSTRAINT CHECK
can describe arbitrary constraints against the current row (it explicitly does not work for other rows). However, constraints across multiple rows are much more trickier:
- It is hard to check the constraints at the application level when your application is distributed and atomicity cannot be easily guaranteed across application instances.
- SQL only provides syntax to describe a very limited set of constraints, such as
UNIQUE
,EXCLUDE
,FOREIGN KEY
, etc. It is inadequate to describe arbitrary constraints for application semantics.
Many think database triggers straightly solve the problem: execute your SQL function to validate the constraints as a trigger. As the trigger is executed in the same transaction with the insertion or update, atomicity and consistency is 'automatically guaranteed'. I had thought so, until I found incorrect data passed though my validation trigger and saved in the database. In the course of resolving the issue, I came across incorrect Stackoverflow answers, unattended questions, and text book sections that discuss triggers without mentioning concurrency and isolation level. I realised that this might be something that many people misunderstood about, and it is worthy to put my findings together as a thorough discussion. (Most of this post is based on PostgreSQL. )
Are database trigger concurrency-safe?
A short answer: No. A more precise answer: it depends on your transaction isolation level.
In the SQL standard, the most restrict isolation level is SERIALIZABLE
, at which level all transactions run as if there is only one transaction at a time. As database triggers run in the same transaction as the event, when a database trigger is being executed, there is no other active transaction to violate atomicity. Thus, database triggers are concurrency-safe with the SERIALIZABLE
isolation level.
However, most DBMS do not use SERIALIZABLE
as default, plus it is not recommended to use SERIALIZABLE
in production. PostgreSQL is defaulted to READ COMMITTED
. A weaker isolation level allows better performance, but it also allows certain data anomalies. For example, READ COMMITTED
allows non-repeatable reads, that is saying, repeating reads of the same data in one transaction may have different results.
Let's use the Stackoverflow question mentioned above as an example. We want to validate the constraint that geometries in a table do not overlap with each other. We add a BEFORE INSERT OR UPDATE
trigger to execute a SQL function to check if any existing geometries overlap with the new row; abort if overlapping. The trigger code is shown as below:
<pre class="EnlighterJSRAW" data-enlighter-group="" data-enlighter-highlight="" data-enlighter-language="sql" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-theme="" data-enlighter-title="">CREATE OR REPLACE FUNCTION overlap_check() RETURNS TRIGGER
LANGUAGE plpgsql
AS $_$
DECLARE
overlap_ids int[];
BEGIN
SELECT array_agg(id) INTO overlap_ids FROM my_table WHERE
id <> NEW.id
AND geom && NEW.geom;
IF cardinality(overlap_ids) > 0 THEN
RAISE EXCEPTION 'OVERLAP: Found % overlapping with new record', overlap_ids;
END IF;
RETURN NEW;
END;
$_$;
CREATE TRIGGER overlap_trigger
BEFORE INSERT OR UPDATE ON my_table
FOR EACH ROW
EXECUTE PROCEDURE overlap_check();
The trigger is invoked within the same transaction before an insertion. Considering the transaction implication, the pseudo code looks like this:
<pre class="EnlighterJSRAW" data-enlighter-group="" data-enlighter-highlight="" data-enlighter-language="sql" data-enlighter-linenumbers="true" data-enlighter-lineoffset="" data-enlighter-theme="" data-enlighter-title="">START TRANSACTION
-- trigger function starts
-- S1:
SELECT array_agg(id) INTO overlap_ids FROM my_table WHERE
id <> NEW.id
AND geom && NEW.geom;
IF cardinality(overlap_ids) > 0 THEN
RAISE EXCEPTION 'OVERLAP: Found % overlapping with new record', overlap_ids;
END IF;
-- trigger function ends, we can insert the row
-- S2:
INSERT INTO my_table () VALUES (new row);
END TRANSACTION
S1 reads data to check if there are overlapping geometries. If there is no overlap, S2 inserts the data. These all happen in the same transaction. However, assuming the database runs with READ COMMITT
that allows 'non-repeatable reads', at S2, we can no longer assume our check at S1 still holds true. The isolation level simply does not guarantee repeatable reads through a transaction: other processes may insert geometries between S1 and S2, and they may possibly overlap with the new row here. But the insertion at S2 will still insert the new row without knowing the constraint is already broken.
Weaker isolation level puts the concurrency-safety back to the programmers' hands.
How to make triggers safe?
Knowing there might be race conditions, our question is: how can we make triggers concurrency-safe? Here I will present four different approaches that work in terms of correctness, and discuss their pros and cons.
Option 0: Rephrase your trigger to UNIQUE
and EXCLUDE
This technically is not an approach for implementing triggers. However, before diving into triggers and managing concurrency by ourselves, I think it is worthwhile to try match your constraints' semantics into the SQL standard UNIQUE
and EXCLUDE
. For example, the non-overlapping geometries constraint can be expressed with EXCLUDE
:
<pre class="EnlighterJSRAW" data-enlighter-group="" data-enlighter-highlight="" data-enlighter-language="sql" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-theme="" data-enlighter-title="">ALTER TABLE my_table
ADD CONSTRAINT my_table_no_overlapping_geometry
EXCLUDE USING GIST (
-- if we want to make sure that only geometries within a user do not overlap, include the following line. We will discuss this example later.
-- user_id WITH =,
geom WITH &&
);
I would strongly recommend reading through the manual about UNIQUE
and EXCLUDE
. Though they cannot express arbitrary constraints, they can handle some constraints across multiple rows with certain conditions and may be more powerful than you thought. However, if they are not powerful enough for your case, let's talk about triggers.
Option 1:SERIALIZABLE
transaction
As we have discussed, the concurrency problem is that the isolation level allows certain kinds of data anomalies, such as 'unrepeatable reads'. The easy way to fix it is to change the isolation level to SERIALIZABLE
. One can change the default isolation level to SERIALIZABLE
for the database, or use SET TRANSACTION
to set isolation level per transactions. The latter has a smaller performance impact, as it only treats transactions that you care with the SERIALIZABLE
level.
In practice, one issue arising from this approach makes it less ideal. As real SERIALIZABLE
is too restrict to handle high throughput, most databases make assumptions about the SERIALIZABLE
level. On PostgreSQL, SERIALIZABLE
works exactly the same as REPEATABLE READS
, except that in the case where the properties of SERIALIZABLE
will break, the transactions will be simply aborted to make sure all succeeding transactions are still compliant with SERIALIZABLE
(read here). So a SERIALIZABLE
transaction may arbitrarily fail in concurrent cases and an application-level retry mechanism is required to handle the failed cases. The retry time can be unbounded as there is no guarantee that the transaction will succeed.
For this reason, SERIALIZABLE
is generally not recommended to handle the concurrency-safety here. It is preferable to handle the concurrency within the database, instead of on the application level.
Option 2: Explicit table lock
As said earlier, it would be good if we can handle the concurrency safety within the database. PostgreSQL uses implicit locks to implement isolation levels, and it allows a LOCK TABLE
statement to explicitly acquire locks. We could use explicit table locks to give us the concurrency guarantee we want. There is a little background we need to fill in before diving into the solution.
- When executing most SQL statements, the database will acquire corresponding locks to make sure the concurrent behaviours match what the isolation level requires. For example,
SELECT
will acquire aACCESS SHARE
lock, andUPDATE
will acquire aROW EXCLUSIVE
lock. Do not worry too much about the lock names; there are some historical reasons for the names, and the names do not really tell what they actually mean. - A lock mode conflicts with a certain set of lock modes. The exclusiveness definition can be found here. A lock mode is not necessarily self-conflicting. Generally some locks are 'stronger' - it is exclusive with more lock modes.
- A lock needs to be granted before the statement can be executed. If other process holds the hold, the current process will be blocked and waiting for the lock release.
- Once a lock is granted, the process holds it until the current transaction finishes (either succeeds or fails). There is no explicit release.
- Lock contention only happens between different processes. If a process holds a lock, and it tries to acquire a new lock that is exclusive to the existing lock, the acquiring can still be successful, as it won't be self-conflicting. Thus two locks will be held by the process (though one is 'stronger' than the other).
Knowing this background, let's get back to the SQL pseudo code earlier, and add the locking behaviour to the pseudo code.
<pre class="EnlighterJSRAW" data-enlighter-group="" data-enlighter-highlight="" data-enlighter-language="sql" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-theme="" data-enlighter-title="">START TRANSACTION
-- L1:
IMPLICIT LOCK for INSERT - ROW EXCLUSIVE LOCK
-- trigger function starts
-- L2:
IMPLICIT LOCK for SELECT - SHARE LOCK
-- S1:
SELECT array_agg(id) INTO overlap_ids FROM my_table WHERE
id <> NEW.id
AND geom && NEW.geom;
IF cardinality(overlap_ids) > 0 THEN
RAISE EXCEPTION 'OVERLAP: Found % overlapping with new record', overlap_ids;
END IF;
-- trigger function ends, we can insert the row
-- S2:
INSERT INTO my_table () VALUES (new row);
END TRANSACTION
-- L1 and L2 are released
As we are executing an insertion, at the beginning of the transaction, PostgreSQL needs to acquire lock L1 for insertion. Then the trigger function gets executed, and lock L2 is granted for the select statement. There are a few things we need to note:
- L1 happens at the beginning of the transaction instead of just before S2. This can be observed by opening two Postgres clients, doing the transactions manually and observing the
pg_locks
table (an example could be found here). - L2 is a
SHARE
lock, and it does not conflict with L1 or self. That means other processes can do insertion while this process holds L1, and this is where the 'unrepeatable reads' behaviour comes from. - L1 is a
ROW EXCLUSIVE
lock, and not self-conflicting. That means two processes can holds this lock, and do insertion at the same time. This is why the trigger attempt below will cause deadlocks.
Understanding these, we can again look at the unattended mail list question: why using explicit lock in a trigger function will cause deadlocks. Let's further add the explicit lock L0 to the pseudo code as the question.
<pre class="EnlighterJSRAW" data-enlighter-group="" data-enlighter-highlight="" data-enlighter-language="sql" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-theme="" data-enlighter-title="">START TRANSACTION
-- L1:
IMPLICIT LOCK for INSERT - ROW EXCLUSIVE LOCK
-- trigger function starts
-- L0:
EXPLICIT LOCK - X LOCK
-- L2:
IMPLICIT LOCK for SELECT - SHARE LOCK
-- S1:
SELECT * FROM my_table WHERE existing geoms overlap the new row;
IF FOUND THEN
RAISE EXCEPTION;
END IF;
-- trigger function ends, we can insert the row
-- S2:
INSERT INTO my_table () values (new row);
END TRANSACTION
We wanted to introduce the explicit lock L0 so that the condition we verified at S1 stays true at S2, which means the lock mode X needs to conflict with the insertion lock L1 ('stronger' than L1). However, this will cause deadlocks: two processes may both have acquired L1 (as its not self-conflicting), and attempts to acquire L0. But L0 conflicts with the other process' L1, so both processes cannot proceed.
Fixing the problem is straight-forward: in the application code, start the transaction manually, then immediately acquire L0 lock, and do the insertion which will execute the trigger after L0 lock. This works correctly and cleanly. The code is shown as below.
<pre class="EnlighterJSRAW" data-enlighter-group="" data-enlighter-highlight="" data-enlighter-language="sql" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-theme="" data-enlighter-title="">BEGIN;
-- L0: EXPLICIT LOCK, the lock mode needs to be stronger than L1 ROW EXCLUSIVE, e.g. use SHARE ROW EXCLUSIVE
LOCK TABLE my_table IN SHARE ROW EXCLUSIVE LOCK;
-- L1: IMPLICIT LOCK for INSERT - ROW EXCLUSIVE LOCK
-- L2: IMPLICIT LOCK for SELECT (in the trigger) - SHARE LOCK
INSERT INTO my_table () VALUES (new row);
END TRANSACTION
Option 3: Finer grained advisory lock
The above approach works fine. But it has performance implication. L0 acquired by LOCK TABLE
is a table-level lock. It locks the whole table so that only one process can do the insertion transaction. In this example, it works exactly as we expect - we want to lock the whole table to check geometries overlapping.
However, let's say, our requirement is a little bit different: instead of making sure that geometries in the whole table do not overlap, we put geometries for different users in the table, and we want to make sure that geometries for one user do not overlap (different users may have overlapping geometries). That is saying when inserting geometries for a user, we only need to check part of the table that are related with the user. If we still use the above approach, a table-level lock from LOCK TABLE
locks the entire table and prevents possible concurrency of inserting geometries of different users at the same time, which could become a performance bottleneck.
PostgreSQL provides an application-defined lock called advisory lock. pg_advisory_lock()
can take any integer, and apply locking on the integer (we can also lock on strings by hashing strings into ints with hashtext(text)
). Advisory locks are held until the end of the transaction as table locks, but they can be released earlier with pg_advisory_unlock()
which allows more precise control from the programmer. With advisory locks, in our case, we can apply advisory lock with the user's id to prevent concurrent operations only within a user's data. With advisory lock, our trigger function would look like this:
<pre class="EnlighterJSRAW" data-enlighter-group="" data-enlighter-highlight="" data-enlighter-language="sql" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-theme="" data-enlighter-title="">CREATE OR REPLACE FUNCTION overlap_check() RETURNS TRIGGER
LANGUAGE plpgsql
AS $_$
DECLARE
overlap_ids int[];
BEGIN
-- app-defined advisory lock only locking on user_id
-- the lock will be released at the end of the transaction
PERFORM pg_advisory_lock(NEW.user_id);
SELECT array_agg(id) INTO overlap_ids FROM my_table WHERE
id <> NEW.id
AND geom && NEW.geom;
IF cardinality(overlap_ids) > 0 THEN
RAISE EXCEPTION 'OVERLAP: Found % overlapping with new record', overlap_ids;
END IF;
RETURN NEW;
END;
$_$;
Option 4: Finer grained implicit lock with UPSERT
This is quite an interesting approach, as an answer here. For our example, we create an extra helper table trigger_helper
,
<pre class="EnlighterJSRAW" data-enlighter-group="" data-enlighter-highlight="" data-enlighter-language="sql" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-theme="" data-enlighter-title="">CREATE TABLE trigger_helper (val int PRIMARY KEY);
and in our trigger function, we perform an UPSERT
about the user id to the trigger_helper
table. The pseudo code is as below.
<pre class="EnlighterJSRAW" data-enlighter-group="" data-enlighter-highlight="" data-enlighter-language="sql" data-enlighter-linenumbers="" data-enlighter-lineoffset="" data-enlighter-theme="" data-enlighter-title="">CREATE OR REPLACE FUNCTION overlap_check() RETURNS TRIGGER
LANGUAGE plpgsql
AS $_$
DECLARE
overlap_ids int[];
BEGIN
-- UPSERT with implicit lock on the trigger_helper table
INSERT INTO trigger_helper (val)
VALUES (NEW.user_id)
ON CONFLICT (val) DO UPDATE
SET val = NEW.user_id
WHERE FALSE;
SELECT array_agg(id) INTO overlap_ids FROM my_table WHERE
id <> NEW.id
AND geom && NEW.geom;
IF cardinality(overlap_ids) > 0 THEN
RAISE EXCEPTION 'OVERLAP: Found % overlapping with new record', overlap_ids;
END IF;
RETURN NEW;
END;
$_$;
An UPSERT
(INSERT
with ON CONFLICT DO UPDATE
clause) may insert a value into the table, or may update the value if it exists. The same as other SQL statements, UPSERT
implicitly acquires a lock to make sure the correctness, and the lock holds until the transaction is done. This implicit lock is row-based, which means for different user ids, we will lock on different rows in the trigger_helper
table so that their trigger functions and insertions can run in parallel.
The original Stackexchange answer stated that this implicit lock is 'much cheaper' (than a table-level lock, such as Option 2), and 'less intrusive'. However, it is unclear whether this approach is better than using advisory locks. AliCloud has a few posts (in Chinese) about using Postgres' advisory lock for high concurrency and high throughput scenarios, and showed that proper use of advisory locks can achieve better performance than table-level lock as well.
Summary
Triggers are a powerful tool to implement arbitrary constraints. However, they are not concurrency safe as many people may assume. We discussed why they are not concurrency safe, and how isolation level affects the concurrent execution. We discussed four different approaches to make trigger concurrency-safe. Using SERIALIZABLE
transaction is the easiest fix, but it requires application-level retry and may incur unbounded waiting time before a transaction succeeds. We discussed about Postgres' locks with regards to SQL statements, and how we can use explicit table lock to make triggers safe. However, table locks are coarse grained, and we further discussed two approaches with finer grained lock - the triggers are concurrency safe without the need to lock up the whole table. I reckon both advisory locks and UPSERT
locks are versatile and performant to handle most validation triggers.