Sunday, June 20. 2010
Triggers, NOT FOR REPLICATION Identity, and Merge Replication
Initial Note/Warning
I hope that there is a more elegant, simple, and robust method for dealing with the issues described in this post than the one that I present herein. My method is a rather ugly kludge that relies on undocumented features of SQL Server and is not nearly as parallelizable as I would like. If you are aware of a better method (which I hope exists), please don't hesitate to post it in the comments. Until a better method is posted, I invite you to use and/or learn from the method presented in this post.
Background
Consider an SQL Server 2005 database which is published for merge
replication from a single publisher with multiple subscribers. The database
contains 2 tables which we are considering:
Forms
and
InboxForms
.
Forms
contains some sort of form data and
InboxForms
references all forms which are
present in a user's "inbox". Each of the tables contains an
INT IDENTITY
primary key column and several other
columns of data that are not relevant to the problem at hand. The publication
is filtered based on the Forms that each user is able to view (determined by
some separate mechanism not described here). When a new row is inserted into
Forms
, a trigger is used to route the forms
into the appropriate inbox(es).
The Problem
The routing can not occur (exclusively) at the subscribers, because the
filter for the subscriber will not necessarily include
user information for the recipient of the form and the form can not be placed
into InboxForms
if it recipient user does not
exist on the subscriber. So, the trigger must run on the publisher during
synchronization when inserts
are performed by the merge agent (i.e. the trigger must not be marked
NOT FOR REPLICATION
). However, in this configuration,
when the merge agent runs, the following error message is produced:
Explicit value must be specified for identity column in table 'InboxForms' either when IDENTITY_INSERT is set to ON or when a replication user is inserting into a NOT FOR REPLICATION identity column.
The problem is that the INT IDENTITY
column in
InboxForms
was marked NOT FOR
REPLICATION
when the publication was created in order to facilitate
automatic identity range management, which is described in the
Replicating
Identity Columns article in the SQL Server Books Online.
NOT FOR REPLICATION
behaves very similarly to
IDENTITY_INSERT
(as hinted at in the error message), such that when a row is inserted by the
merge agent, the ideneity seed value is not incremented and the value of the
identity column must be explicitly specified. Note, however, that it is not
the same mechanism as IDENTITY_INSERT, so changing IDENTITY_INSERT in the
trigger will not remove the requirement for explicitly specified identity
values.
The Solution
One method to solve this problem is to disable NOT FOR
REPLICATION
, as suggested in
KB908711 (which
specifically addresses this issue). However, using this option will
interfere with automatic identity range management, since the identity values
generated on the subscribers can not be copied to the publisher, and other
steps will need to be taken to manually manage identity values. For me, this
was an unacceptably high price to pay and another solution was required.
A solution which does not interfere with automatic identity range
management is to calculate values for the identity columns and explicitly
specify those values when they are required. In order to accomplish this,
understanding several features of T-SQL is required: In order to determine
when the values are required, the trigger needs to test if it is being run
from the merge replication agent. This can be done by testing for the
'replication_agent'
SESSIONPROPERTY
.
In order to determine appropriate values for the identity column, use
IDENT_CURRENT
and
IDENT_INCR
.
Note that using the maximum value for the identity column is not necessarily
correct because the maximum identity range will not necessarily be allocated
to the publisher.
DBCC
CHECKIDENT
can be used to update the identity seed value (which
is not affected by explicitly inserted identity values).
One other complicating factor in our implementation is that there is no way to atomically insert explicit identity values and update the identity seed value. Therefore, locking is required to prevent multiple connections from simultaneously updating the values and causing collisions. (Or collision errors must be caught and retried) In the following implementation, an exclusive table lock is acquired which prevents any inserts from occurring on the table when the trigger is running. This is a serious performance problem as it prevents any other operations on the locked table from completing while the trigger is executing. Keep this in mind when designing the queries that will run while the lock is held.
Now, without further ado, here's the trigger:
ALTER TRIGGER TR_RouteForms
ON dbo.Forms
AFTER INSERT
AS
BEGIN
-- Client isn't expecting routing counts from their insert
SET NOCOUNT ON;
IF SESSIONPROPERTY('replication_agent') <> 0
BEGIN
-- Running from the replication agent
-- Need explicit value for NOT FOR REPLICATION IDENTITY columns
-- Use transaction to limit lock scope
BEGIN TRAN
-- Variables for IDENT_CURRENT and IDENT_INCR, required
-- because DBCC CHECKIDENT syntax won't support nested parens
DECLARE @Ident INT, @Incr INT;
-- RowCnt used to preserve @@ROWCOUNT
DECLARE @RowCnt INT;
-- Must acquire exclusive lock on InboxForms to prevent other
-- inserts (which would invalidate the identity and cause
-- collisions in the identity column).
-- Select into variable to prevent resultset going to client
-- WHERE clause quickly evaluated, returns small (empty) result
DECLARE @Dummy INT;
SELECT @Dummy = InboxFormID
FROM InboxForms WITH (TABLOCK, XLOCK, HOLDLOCK)
WHERE InboxFormID = 0;
-- Perform the form routing (inserts into InboxForms)
SET @Ident = IDENT_CURRENT('InboxForms');
SET @Incr = IDENT_INCR('InboxForms');
INSERT INTO InboxForms (InboxFormID, FormID, ...)
SELECT @Ident + @Incr * ROW_NUMBER() OVER (ORDER BY FormID) AS InboxFormID, FormID, ...
FROM inserted
WHERE ...routing criteria...
SET @RowCnt = @@ROWCOUNT;
IF @RowCnt > 0
BEGIN
-- At least 1 form was routed, update the identity seed value
-- Note: Can't use MAX(InboxFormID) since publisher may not
-- have been allocated the maximum identity range
SET @Ident = @Ident + @Incr * @RowCnt;
DBCC CHECKIDENT (InboxForms, RESEED, @Ident)
WITH NO_INFOMSGS;
END
COMMIT TRAN
END
ELSE
BEGIN
-- NOT running from the replication agent
-- Can insert normally into NOT FOR REPLICATION IDENTITY columns
-- Perform the form routing (inserts into InboxForms)
INSERT INTO InboxForms (InboxFormID, FormID, ...)
SELECT @Ident + @Incr * ROW_NUMBER() OVER (ORDER BY FormID) AS InboxFormID, FormID, ...
FROM inserted
WHERE ...routing criteria...
END
END